Language adaptation of multilingual phone models for vocabulary independent speech recognition tasks

نویسنده

  • Joachim Köhler
چکیده

This paper presents our new results on multilingual phone modeling and adaptation into a new target language which is not included in the trained multilingual models. The experiments were carried out with the SpeechDat(M) and MacroPhone databases including the languages French, German, Italian, Portuguese, Spanish and American English. First, we constructed language-dependent and multilingual phone models. The recognition rate for an isolated word task decreased in average only by 3.2% using 95 multilingual instead of 232 language-dependent models. Second, we investigated adaptation techniques for cross-language transfer and showed that only 100 utterances from a new language were needed for adaptation. Using the MAP algorithm the recognition rate was improved from 79.9% to 84.3%. Finally, we defined a phonetic based dissimilarity measure between 2 languages and compared languagedependent and multilingual models for the purpose of crosslanguage transfer.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Comparing Three Methods to Create Multilingual Phone Models for Vocabulary Independent Speech Recognition Tasks

This paper presents three different methods to develop multilingual phone models for flexible speech recognition tasks. The main goal of our investigations is to find multilingual speech units which work equally well in many languages. With this universal set it is possible to build speech recognition systems for a variety of languages. One advantage of this approach is to share acoustic-phonet...

متن کامل

Continuous speech dictation in French

A major research activity at LIMSI is multilingual, speaker-independent, large vocabulary speech dictation. In this paper we report on efforts in large vocabulary, speaker-independent continuous speech recognition of French using the BREF corpus. Recognition experiments were carried out with vocabularies containing up to 20k words. The recognizer makes use of continuous density HMM with Gaussia...

متن کامل

RWTH LVCSR systems for quaero and EU-bridge: German, Polish, Spanish and Portuguese

In this paper, German, Polish, Spanish, and Portuguese large vocabulary continuous speech recognition (LVCSR) systems developed by the RWTH Aachen University are presented. All the above mentioned systems for the aforementioned languages are used for the Quaero and EU-Bridge project evaluations. The LVCSR systems developed for these competitive evaluations focus on various domains like broadcas...

متن کامل

Adaptation of Pronunciation Dictionaries for Recognition of Unseen Languages

This paper studies the relative effectiveness of different methods for multilingual model combination and dictionary mapping for recognizing a new unseen target language if training data are limited. We examine the crosslanguage transfer from monolingual and multilingual models to German and Russian language for large vocabulary speech recognition using a dictation database which has been colle...

متن کامل

Language independent and language adaptive large vocabulary speech recognition

This paper describes the design of a multilingual speech recognizer using an LVCSR dictation database which has been collected under the project GlobalPhone. This project at the University of Karlsruhe investigates LVCSR systems in 15 languages of the world, namely Arabic, Chinese, Croatian, English, French, German, Italian, Japanese, Korean, Portuguese, Russian, Spanish, Swedish, Tamil, and Tu...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:
  • Speech Communication

دوره 35  شماره 

صفحات  -

تاریخ انتشار 1998